ArchiveBot: The Archive Team Crowdsourced Crawler

ArchiveBot is an IRC bot designed to automate the archival of smaller websites (e.g. up to a few hundred thousand URLs). You give it a URL to start at, and it grabs all content under that URL, records it in a WARC, and then uploads that WARC to ArchiveTeam servers for eventual injection into the Internet Archive (or other archive sites).
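As a rough illustration of what "all content under that URL" can mean, the sketch below checks whether a discovered link falls under a starting URL by comparing the host and the path prefix. This is only an assumption about one reasonable scoping rule. The function name `in_scope` and the example URLs are hypothetical, and ArchiveBot's actual scope logic is not described here and may differ.

```python
from urllib.parse import urlsplit


def in_scope(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url is on the same host as start_url
    and its path sits at or below the starting path.

    Hypothetical helper for illustration; not ArchiveBot's own code.
    """
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)

    # A different host is always treated as out of scope in this sketch.
    if candidate.hostname != start.hostname:
        return False

    # Treat the starting path as a directory prefix, so that
    # "/blog" matches "/blog/post" but not "/blogger".
    prefix = start.path if start.path.endswith("/") else start.path + "/"
    candidate_path = candidate.path or "/"
    return candidate_path == start.path or candidate_path.startswith(prefix)


if __name__ == "__main__":
    start = "http://example.com/blog/"
    for url in (
        "http://example.com/blog/post-1",
        "http://example.com/about",
        "http://other.example.com/blog/post-1",
    ):
        print(url, in_scope(start, url))
```

A real crawler usually also fetches page requisites (images, stylesheets) from other hosts, which a simple prefix test like this one would exclude.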

To use ArchiveBot, drop by #archivebot on EFNet. You interact with ArchiveBot by typing commands into the channel. Note that you need channel operator permissions to start archiving jobs.

A dashboard for the ArchiveBot process runs at http://www.archivebot.com. It shows the sites currently being downloaded.

ArchiveBot's source code can be found at https://github.com/ArchiveTeam/ArchiveBot.

1,083 results (web: 1,083)

Part of: Archive Team, Web Crawls

TOPIC
foseti.wordpress.com 2
itunes.apple.com 2
twitter.com 2
184.180.244.41 1
3dblogger.typepad.com 1
ahkscript.org 1
approachingaro.org 1
arstechnica.com 1
ay-riders.speccy.cz 1
becominggaia.wordpress.com 1
bitcointalk.org 1
blakemasters.com 1
blog.42floors.com 1
blog.coinjar.io 1
blog.dispatch.cc 1
blog.do.com 1
blog.dopplr.com 1
blog.ioactive.com 1
blog.lazymeter.com 1
blog.muflax.com 1
blog.zapd.com 1
blogs.hbr.org 1
boringasheck.com 1
breadlabs.tumblr.com 1
buddhism-for-vampires.com 1
coinjar.io 1
commonsware.com 1
cursiveclojure.com 1
daily.muflax.com 1
darkmail.info 1
darkpatterns.org 1
developer.leapmotion.com 1
developers.seagate.com 1
do.com 1
fanart.lionking.org 1
files.chatnfiles.com 1
forsythia.net 1
forum.reverse4you.org 1
forums.eventscripts.com 1
forums.spiderbasic.com 1
forums.unrealengine.com 1
forums.vandyke.com 1
framebase.io 1
freeweibo.com 1
garyfung.ca 1
gospel.muflax.com 1
help.do.com 1
hintsforums.macworld.com 1
horseecomics.tumblr.com 1
ilovelilychouchou.com 1
irclog.perlgeek.de 1
isohunt.com 1
kennethfolkdharma.com 1
labs.enigma.io 1
ludios.org 1
malmen.org 1
math.eretrandre.org 1
mathematicalmulticore.wordpress.com 1
medium.com 1
mitchell.jp 1
mrlandsberry.weebly.com 1
msram.github.io 1
muflax.com 1
nakkaya.com 1
nathanic.org 1
ootbcomp.com 1
personal.inet.fi 1
plus.google.com 1
privacy.cryptoseal.com 1
sachachua.com 1
security.mongohq.com 1
sgvo.homestead.com 1
sharknet.us 1
sinocism.com 1
slatestarcodex.com 1
softwarethis.com 1
southforksecurity.com 1
springfieldfiles.com 1
storify.com 1
support.doctape.com 1
support.google.com 1
support.leapmotion.com 1
tekpub.com 1
thenewinquiry.com 1
theprofoundprogrammer.com 1
therafirmata.webklik.nl 1
thesmokinggun.com 1
thinkprogress.org 1
tug.org 1
upl-gravedigger.boo.jp 1
uproxy.org 1
us2.campaign-archive1.com 1
usemycomputer.com 1
utopia.duth.gr 1
venturebeat.com 1
vgoulet.act.ulaval.ca 1
vndb.org 1
warhammeronline.com 1
wekeroad.com 1
windowsmoviemakers.net 1
wristifyme.com 1
www.animenewsnetwork.com 1
www.cs.carleton.edu 1
www.devttys0.com 1
www.dopplr.com 1
www.eventbrite.com 1
www.facebook.com 1
www.fullscreenmario.com 1
www.getairo.com 1
www.gwern.net 1
www.healthcare.gov 1
www.interfluidity.com 1
www.kanjitomo.net 1
www.killers4hire.com 1
www.loper-os.org 1
www.magicthegatheringtactics.com 1
www.mcfallweb.com 1
www.microcodeconsulting.com 1
www.practicefusion.com 1
www.reddit.com 1
www.righto.com 1
www.segagagadomain.com 1
www.solarus-games.org 1
www.trunker.info 1
www.unitrunker.com 1
www.urbit.org 1
www.warhammeronline.com 1
www.washingtonpost.com 1
www.xoxohth.com 1
www.zapd.com 1
www.zhoutong.com 1
x.rubini.us 1
xkcdexplained.tumblr.com 1
zuckerbergfiles.org 1
ArchiveBot is an Archive Team service to quickly grab smaller at-risk or critical sites to bring copies into the Internet Archive Wayback machine.
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
40,926
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
36,643
0
0
Topics: 3dblogger.typepad.com, ahkscript.org, approachingaro.org, arstechnica.com,...
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
24,503
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
24,213
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
22,343
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
22,028
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
18,943
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
18,223
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
16,111
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
15,830
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
15,234
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,804
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,737
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,274
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,148
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,125
0
0